        IDA SDK - Interactive Disassembler Module SDK
        =============================================

        This SDK should be used with IDA kernel version 4.1.4

        This package allows you to write:
                - processor modules
                - input file loader modules
                - plugin modules
                
        You MUST read through the whole file: it contains many clues, hints
        and information that is vital for writing your own modules.

-----------------------------------------------

        What you need:

To create 32bit MS DOS modules:         Watcom 10.0b and make.exe from Borland
To create 32bit OS/2 modules:           Watcom 10.0b and make.exe from Borland
To create 32bit Win32 modules:          Borland C++ Builder v4.0 or free BCC v5.5 

I don't use wmake.exe from Watcom because its makefile format is very strange.
All source files are the same for all platforms and are compiled using the
same makefile. You need to specify the target on the command line:

        make -D__WATCOMC__ -D__MSDOS__                  -- for 32 bit MSDOS
        make -D__WATCOMC__ -U__MSDOS__                  -- for 32 bit OS/2
        make -D__NT__      -U__MSDOS__                  -- for 32 bit WIN32

The 32 bit MSDOS and OS/2 versions can be built only with Watcom running
under MSDOS or OS/2.
The 32 bit WIN32 version can be built only with the Borland compilers
listed above.
The old BCC v5.2 will probably work too, but I didn't check it.
I didn't test other Watcom versions; higher versions may work too.

It's a good idea to create command files to invoke make, since the command
line gets long.
See the examples in MODULE\*.bat

-----------------------------------------------

        Installation:

0. Unzip the archive with the -d switch:

        pkunzip -d idasdk.zip

1. Edit files
        allmake.mak             modify paths to your compilers
        *.cfg                   compiler configuration files
        
2. Set the environment variable IDA to point to the directory with the SDK.
   There must be a file named allmake.mak in this directory:

        set IDA=c:\idasdk\

   (you may use the *.BAT files from the distribution for compilation,
   just don't forget to edit them too)

3. Add the compiler directory to PATH because the compiler and other utilities
   will be called from the makefiles without a path.
   Copy BIN\PEUTIL.EXE to some directory in the PATH.

4. Compile and build the utilities in the ETC directory. See the ETC\README file.

        That's all.

-----------------------------------------------

        Directories in the SDK:


INCLUDE         - header files
MODULE/EXAMPLE          - a real example of a disassembler module (Intel 8051)
MODULE/EXAMPLE/WAT.OS2  - output directory for OS/2 object files
MODULE/EXAMPLE/WAT.D32  - output directory for MSDOS 32bit object files
MODULE/EXAMPLE/BOR.W32  - output directory for WIN32 object files
LDR/SAMPLE              - a real example of a loader module (w32run)
LIBWAT.D32      - 32bit libraries for MSDOS
LIBWAT.OS2      - libraries for OS/2
LIBBOR.W32      - libraries for WIN32
BIN             - directory with SDK utilities
                  DOS32 result files will appear here
BIN/OS2         - ditto for OS/2
BIN/W32         - ditto for Win32
D32WAT.CFG      - Compiler configuration file for Watcom, target 32bit MSDOS
                  Don't forget to edit it (modify directories)!
OS2WAT.CFG      - Compiler configuration file for Watcom, target 32bit OS/2
                  Don't forget to edit it (modify directories)!
W32BOR.CFG      - Compiler configuration file for BCC 5.02, target WIN32
                  Don't forget to edit it (modify directories)!
ALLMAKE.MAK     - 'make' configuration file. You should edit the paths to
                  the compilers here.
README          - this file.

Directory include should be accessible as $(IDA)include
Directory lib     should be accessible as $(IDA)lib
Directory bin     should be accessible as $(IDA)bin
Files *.cfg and allmake.mak should be in directory $(IDA)

Other files (including the sources of your module) can be located anywhere.

-----------------------------------------------

        A quick tour of the header files:

AREA     HPP            class 'area'. This class is a base class for
                        'Segment' and 'SrArea' (segment register) classes.
                        This class keeps information about various areas
                        of the disassembled file.
AREG     HPP            abstract registers. Actually, this class isn't used
                        anywhere except for Intel 80x86 processors.
                        Just don't use it.
KERNWIN  HPP            various functions to interact with the user.
                        The disassembler module can use these functions at
                        the initial file loading time.
                        Also, some functions to process strings are kept in
                        this header.
AUTO     HPP            auto-analysis control.
                        You can use autoMark(ea,AU_CODE) function to plan
                        the conversion of bytes at address 'ea' to
                        instructions.
BYTES    HPP            Functions and definitions which describe each byte
                        of the disassembled program: is it an instruction,
                        data, operand types etc.
DISKIO   HPP            file i/o functions. They may be used by an IDP
                        if it has a 'loader' function defined, i.e. the IDP
                        itself loads the input file, not the kernel.
                        Almost all functions check for errors and display a
                        fatal error dialog box if something goes wrong.
                        See files pro.h and fpro.h for additional system functions
ENTRY    HPP            List of entry points to the program being
                        disassembled.
FIXUP    HPP            information about relocation table of the program.
FPRO     H              Alternative set of system-independent file i/o
                        functions. These functions do check for errors but
                        never exit even if an error occurs. They return an
                        extended error code in the qerrno variable.
                        You must use these functions, not the functions from
                        stdio.h
FUNCS    HPP            Functions in the disassembled program
FRAME    HPP            Local variables, stack trace
IEEE     H              IEEE floating point functions.
STRUCT   HPP            Structures in the disassembled program
ENUM     HPP            Enums in the disassembled program
HELP     H              Help subsystem. This subsystem is not used in
                        IDP files. It is included just in case.
IDA      HPP            the 'main' header file of the IDA project.
                        Actually it is not :), although this file is included
                        in all source files. Here the 'inf' structure is
                        defined: it keeps all parameters of the disassembled
                        file, including parameters like 'number of
                        xrefs to display'. This structure is saved into the
                        database. IDP shouldn't change this structure except
                        at the loading time.
IDP      HPP            the 'main' header file of IDP modules.
                        2 structures are described here:
                          processor_t - description of processor
                          asm_t       - description of assembler
                        Each IDP has one processor_t and many asm_t structures.
INTS     HPP            predefined comments. shouldn't be used right now.
LINES    HPP            generation of source (assembler) lines and long
                        comment lines. variables controlling the exact time
                        and place to generate xrefs, indented comments etc.
                        shouldn't be used in simple IDP modules.
NAME     HPP            names: rename, unname bytes etc.
NETNODE  HPP            the lowest level of access to the database. An IDP module
                        can use this level to keep some private information
                        in the database. Here is a short description of
                        the concept:
                          the database consists of 'netnodes'. Netnodes
                          can be linked using 'netlinks'. Each netnode has an
                          internal number and may have:
                            - a name (max length is MAXNAMESIZE-1)
                            - a value (a string)
                            - sparse arrays of values:
                              Each sparse array has a tag (char). Therefore,
                              we can have 256 sparse arrays in one netnode.
                              Only non-zero elements of arrays are kept in
                              the database. Arrays are indexed by 32-bit
                              indexes. You can keep anything in one element of
                              the array. Size of one element is limited
                              by MAXSPECSIZE. For example, you could have an
                              array of addresses that are patched by the user:

                              <address> : <old_value_of_byte>

                              The array will be empty at the start and will
                              grow as the user patches the input file.
                              There are 2 predefined arrays:

                                - strings       (supval)
                                - longs         (altval)

                              The arrays don't need to be declared or created
                              specially. They implicitly exist in each node.
                              To save something to an array simply write
                              to an array element (altset or supset functions)
                          Each netlink has its type and name. A netlink between
                          two netnodes can have a text associated with
                          it (linkspec). BTW, some time ago comments were kept
                          as netlink descriptions:
                            assume the user specified a comment for
                            address 1234 as "This is a comment".
                            This comment is saved as a netlink of type "comment"
                            from netnode number 1234 to the netnode named "root".
                            This netlink has the associated text "This is a comment".
                        If you need to create a data structure that can't
                        be kept on the stack (never use heap in IDP modules)
                        you may use netnodes to save it. There are no
                        limitations on the size or number of netnode arrays.
NALT     HPP            some predefined netnode array indexes used by the
                        kernel. These functions should not be used directly.
OFFSET   HPP            functions that work with 'offset's.
PRO      H              some system-independent functions.
                        You must use these functions instead of the functions
                        from stdlib.h
QUEUE    HPP            queue of problems.
SEGMENT  HPP            class segment_t and related functions.
SRAREA   HPP            class segreg_t. If your processor doesn't have
                        segment registers, you don't need this file.
SYNTAX   HPP            IDC files. Highest level functions.
EXPR     HPP            IDC language functions.
UA       HPP            This header file describes insn_t structure called
                        cmd: this structure keeps a disassembled instruction
                        in the internal form. Also, you will find here
                        helper functions to create output lines etc.
VA       HPP            Virtual array. Used by other parts of IDA.
                        IDP modules don't use it directly.
VM       HPP            Virtual memory. Used by other parts of IDA.
                        IDP modules don't use it directly.
XREF     HPP            cross-references.
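
The netnode concept described under NETNODE.HPP can be pictured with a small
stand-alone model. This is only a sketch in standard C++ and NOT the SDK's
netnode class: the ToyNetnode type, its stored() helper, the default tag
letters and all signatures are assumptions made for illustration. Only the
names altset/altval/supset/supval and the "tagged sparse array with 32-bit
indexes" idea come from the description above.

```cpp
#include <cstddef>
#include <cstdint>
#include <map>
#include <string>
#include <utility>

// Toy model of one netnode (hypothetical; not the real SDK class).
// Every sparse array is selected by a one-character tag and indexed by a
// 32-bit index. Arrays are never declared: they exist implicitly.
class ToyNetnode {
    typedef std::pair<char, std::uint32_t> Key;   // (array tag, element index)
    std::map<Key, long>        alt_;              // arrays of longs
    std::map<Key, std::string> sup_;              // arrays of strings
public:
    // Write a long. Writing 0 removes the element, so only non-zero
    // elements occupy storage.
    void altset(std::uint32_t idx, long value, char tag = 'A') {
        if (value == 0) alt_.erase(Key(tag, idx));
        else            alt_[Key(tag, idx)] = value;
    }
    // Read a long. Elements that were never written read back as 0.
    long altval(std::uint32_t idx, char tag = 'A') const {
        std::map<Key, long>::const_iterator it = alt_.find(Key(tag, idx));
        return it == alt_.end() ? 0 : it->second;
    }
    // Write a string. An empty string removes the element.
    void supset(std::uint32_t idx, const std::string &value, char tag = 'S') {
        if (value.empty()) sup_.erase(Key(tag, idx));
        else               sup_[Key(tag, idx)] = value;
    }
    // Read a string. Missing elements read back as an empty string.
    std::string supval(std::uint32_t idx, char tag = 'S') const {
        std::map<Key, std::string>::const_iterator it = sup_.find(Key(tag, idx));
        return it == sup_.end() ? std::string() : it->second;
    }
    // Number of elements that actually occupy storage.
    std::size_t stored() const { return alt_.size() + sup_.size(); }
};
```

In this model the "patched bytes" example becomes node.altset(address,
old_value, 'P') with a tag of your choice. Note that a toy like this cannot
tell "old value was 0" from "never patched"; a real design has to account
for that.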


-----------------------------------------------

        Libraries
        
IDA      LIB    - import library with all functions exported from the kernel

        There are 3 different versions of this file, one for each platform.

-----------------------------------------------

        Description of the IDP example

     The IDP module disassembles an instruction in several steps:
        - analysis (decoding)           file ana.cpp
        - emulation                     file emu.cpp
        - output                        file out.cpp

     The analyser (ana.cpp) should be fast and simple: just decode an
     instruction and fill the cmd structure. The analyser will always be called
     before the emulator and output functions. If the current address
     can't contain an instruction, it should return 0. Otherwise, it returns
     the length of the instruction in bytes.
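
     The analyser contract (return the length, or 0 if there is no
     instruction) can be sketched outside the SDK as follows. Everything
     here is an assumption made for illustration: the three-opcode
     instruction set is invented, ToyInsn only stands in for the real cmd
     structure, and a real analyser reads bytes through kernel functions
     rather than from a std::vector.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Invented instruction set, for illustration only:
//   0x00           nop            (1 byte)
//   0x01 imm8      load #imm8     (2 bytes)
//   0x02 lo hi     jmp  addr16    (3 bytes, little-endian target)
enum ToyOp { TOY_NOP, TOY_LOAD, TOY_JMP };

// Hypothetical stand-in for the decoded-instruction structure (cmd).
struct ToyInsn {
    std::uint32_t ea;     // address of the instruction
    ToyOp         itype;  // which instruction it is
    std::uint32_t value;  // immediate or jump target, 0 if unused
    int           size;   // length in bytes
};

// Store the decoded fields and return the instruction length.
static int toy_fill(ToyInsn &out, std::uint32_t ea, ToyOp op,
                    std::uint32_t value, int size) {
    out.ea = ea;
    out.itype = op;
    out.value = value;
    out.size = size;
    return size;
}

// Decode one instruction at 'ea'. Returns its length in bytes, or 0 if
// no valid instruction starts there (bad opcode or truncated operand).
int toy_ana(const std::vector<std::uint8_t> &mem, std::uint32_t ea,
            ToyInsn &out) {
    if (ea >= mem.size()) return 0;
    const std::size_t left = mem.size() - ea;  // bytes available from 'ea'
    switch (mem[ea]) {
    case 0x00:
        return toy_fill(out, ea, TOY_NOP, 0, 1);
    case 0x01:
        if (left < 2) return 0;
        return toy_fill(out, ea, TOY_LOAD, mem[ea + 1], 2);
    case 0x02: {
        if (left < 3) return 0;
        const std::uint32_t target =
            mem[ea + 1] | (std::uint32_t(mem[ea + 2]) << 8);
        return toy_fill(out, ea, TOY_JMP, target, 3);
    }
    default:
        return 0;
    }
}
```

     The decoder does nothing but fill the structure: it creates no
     cross-references and produces no text, which is what keeps it fast.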
     
     The emulator and outputter should avoid accessing the program bytes.
     The emulator tries to keep track of register contents, creates
     cross-references, plans the disassembly of subsequent instructions etc.
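
     A minimal sketch of the emulator's role, under stated assumptions:
     ToyInsn, ToyXref and ToyState are hypothetical types, the xref list
     and the "planned" queue stand in for the kernel's cross-reference and
     auto-analysis services, and register tracking is left out. The point
     is that the emulator works only from the already decoded instruction.

```cpp
#include <cstdint>
#include <deque>
#include <vector>

enum ToyOp { TOY_NOP, TOY_LOAD, TOY_JMP };

// Hypothetical stand-in for the decoded-instruction structure (cmd).
struct ToyInsn {
    std::uint32_t ea;     // address of the instruction
    ToyOp         itype;  // which instruction it is
    std::uint32_t value;  // immediate or jump target, 0 if unused
    int           size;   // length in bytes
};

struct ToyXref { std::uint32_t from, to; };

// Stand-in for what the kernel would record on behalf of the module.
struct ToyState {
    std::vector<ToyXref>      xrefs;    // code cross-references created
    std::deque<std::uint32_t> planned;  // addresses queued for analysis
};

// Emulate one decoded instruction: create xrefs and plan what to
// disassemble next. No program bytes are read here.
void toy_emu(const ToyInsn &insn, ToyState &st) {
    if (insn.itype == TOY_JMP) {
        // Unconditional jump: flow continues only at the target.
        ToyXref x;
        x.from = insn.ea;
        x.to = insn.value;
        st.xrefs.push_back(x);
        st.planned.push_back(insn.value);
    } else {
        // Ordinary instruction: flow falls through to the next address.
        st.planned.push_back(insn.ea + static_cast<std::uint32_t>(insn.size));
    }
}
```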

     The outputter produces a line (or lines) that will be displayed on
     the screen.
     It generates only the essential part of the line: the line prefix,
     comments and cross-references are generated by the kernel itself.
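
     What "only the essential part" means can be sketched like this. The
     ToyInsn type, the mnemonics and the column width are assumptions made
     for illustration; a real outputter uses the helper functions from
     ua.hpp instead of building a std::string.

```cpp
#include <cstdint>
#include <cstdio>
#include <string>

enum ToyOp { TOY_NOP, TOY_LOAD, TOY_JMP };

// Hypothetical stand-in for the decoded-instruction structure (cmd).
struct ToyInsn {
    std::uint32_t ea;     // address of the instruction
    ToyOp         itype;  // which instruction it is
    std::uint32_t value;  // immediate or jump target, 0 if unused
    int           size;   // length in bytes
};

// Produce the mnemonic and operands only. The address prefix, comments
// and cross-references are deliberately absent: the kernel adds those.
std::string toy_out(const ToyInsn &insn) {
    char buf[64];
    switch (insn.itype) {
    case TOY_NOP:
        return "nop";
    case TOY_LOAD:
        // Mnemonic padded to 8 columns, then the immediate operand.
        std::snprintf(buf, sizeof buf, "%-8s#0x%02X", "load",
                      unsigned(insn.value));
        return buf;
    case TOY_JMP:
        // Mnemonic padded to 8 columns, then the 16-bit target.
        std::snprintf(buf, sizeof buf, "%-8s0x%04X", "jmp",
                      unsigned(insn.value));
        return buf;
    }
    return "???";  // unknown instruction type
}
```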

MAKEFILE        - makefile for a processor module
                  The DESCRIPTION line
                  should contain the names of the processors handled by this
                  IDP module, separated by colons. The first name is a
                  description of the whole module (not a processor name).
STUB            - MSDOS stub for IDP (for 32-bit modules)
ANA      CPP    - analysis of an instruction: fills the cmd structure.
EMU      CPP    - emulation: creates xrefs, plans to analyse subsequent
                  instructions
INS      CPP    - table of instructions.
OUT      CPP    - generate source lines.
REG      CPP    - description of processor, assemblers, and notify() function.
                  This function is called when certain events occur. You
                  may want to have some additional processing of those events.
IDP      DEF    - IDP description for the linker.
I51      HPP    - local header file. You may have another header file for
                  your module.
INS      HPP    - list of instructions.
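
The colon-separated convention of the DESCRIPTION line (see MAKEFILE above)
can be illustrated with a small parser. This is only a sketch of the
convention: the ToyDescription type and split_description() are hypothetical
names, the sample string in the note below is made up, and the exact makefile
syntax around the line is not shown here.

```cpp
#include <string>
#include <vector>

// Hypothetical result type: the module description plus processor names.
struct ToyDescription {
    std::string              module;      // first field: describes the module
    std::vector<std::string> processors;  // remaining fields: processor names
};

// Split a DESCRIPTION-style string on colons. The first field is the
// description of the whole module, every later field is a processor name.
ToyDescription split_description(const std::string &line) {
    const std::string::size_type npos = std::string::npos;
    ToyDescription d;
    std::string::size_type start = 0;
    bool first = true;
    for (;;) {
        const std::string::size_type colon = line.find(':', start);
        const std::string field =
            line.substr(start, colon == npos ? npos : colon - start);
        if (first) {
            d.module = field;
            first = false;
        } else {
            d.processors.push_back(field);
        }
        if (colon == npos) break;  // last field consumed
        start = colon + 1;
    }
    return d;
}
```

For the made-up input "Toy CPU family:toy1:toy2" this yields the module
description "Toy CPU family" and the two processor names toy1 and toy2.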

-----------------------------------------------

        And finally:

  I'm sure you'll have questions - don't hesitate to contact me,
  I'll be happy to answer.

  I recommend studying the example, then compiling and running it.

  Limitations on IDP:
        - never use malloc/free etc. If you think that you really
          need dynamic memory, use qalloc/qfree provided in pro.h
        - never use functions from stdio.h. Use functions from fpro.h
        - the processor description structure must be named LPH

  Usually I write a new IDP module in the following way:
        - first I create INS.CPP and INS.HPP files
        - write the analyser ana.cpp
        - then outputter
        - and emulator (you can start with an almost empty emulator)
        - and describe processor, assembler, write notify() function

  Naturally, it is easier to copy and modify the example files than to write
  your own files from scratch.
    
  Debugging:
  You can use the debug print functions deb(), msg() and warning():
        deb() - display a line in the messages window if the -z command
                line switch is specified. You may use the debug ids
                IDA_DEBUG_IDP, IDA_DEBUG_LDR, IDA_DEBUG_PLUGIN
        msg() - display a line in the messages window
        warning() - display a dialog box with the message
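
  The difference between msg() and deb() can be pictured with a toy model.
  All names and values below are assumptions made for illustration: the
  TOY_DEBUG_* constants are not the real IDA_DEBUG_* values, ToyMessages
  stands in for the messages window, and treating the -z setting as a bit
  mask of debug ids is a guess at the mechanism, not a statement about the
  kernel.

```cpp
#include <string>
#include <vector>

// Hypothetical debug ids (the real IDA_DEBUG_* values are not given here).
enum ToyDebugId { TOY_DEBUG_IDP = 1, TOY_DEBUG_LDR = 2, TOY_DEBUG_PLUGIN = 4 };

// Toy stand-in for the messages window.
struct ToyMessages {
    unsigned                 debug_mask;  // what the -z switch would enable
    std::vector<std::string> lines;       // lines shown in the window

    explicit ToyMessages(unsigned mask) : debug_mask(mask) {}

    // msg(): the line is always displayed.
    void msg(const std::string &text) { lines.push_back(text); }

    // deb(): the line is displayed only if its debug id is enabled.
    // Returns true if the line was displayed.
    bool deb(ToyDebugId id, const std::string &text) {
        if ((debug_mask & static_cast<unsigned>(id)) == 0) return false;
        lines.push_back(text);
        return true;
    }
};
```

  With ToyMessages w(TOY_DEBUG_IDP), a deb() call tagged TOY_DEBUG_IDP is
  displayed while one tagged TOY_DEBUG_LDR is silently dropped. This lets
  you leave debug prints in a module without cluttering normal runs.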
        
  To stop in the debugger when the module is loaded, you may use the
  __emit__(0xCC) construct in the module initialization code.
        
  BTW, you can save all lines appearing in the messages window to a file.
  Just set an environment variable:

        set IDALOG=ida.log

  I always have this variable set, it is a great help in the development...

-----------------------------------------------------
